Prediction of Protein Retention Times in Anion-Exchange Chromatography Systems Using Support Vector Regression

Authors

  • Minghu Song
  • Curt M. Breneman
  • Jinbo Bi
  • Nagamani Sukumar
  • Kristin P. Bennett
  • Steven M. Cramer
  • Nihal Tugcu
Abstract

Quantitative Structure-Retention Relationship (QSRR) models are developed for the prediction of protein retention times in anion-exchange chromatography systems. Topological, subdivided surface area, and TAE (Transferable Atom Equivalent) electron-density-based descriptors are computed directly for a set of proteins using molecular connectivity patterns and crystal structure geometries. A novel algorithm based on Support Vector Machine (SVM) regression has been employed to obtain predictive QSRR models using a two-step computational strategy. In the first step, a sparse linear SVM was utilized as a feature selection procedure to remove irrelevant or redundant information. Subsequently, the selected features were used to produce an ensemble of nonlinear SVM regression models that were combined using bootstrap aggregation (bagging) techniques, where various combinations of training and validation data sets were selected from the pool of available data. A visualization scheme (star plots) was used to display the relative importance of each selected descriptor in the final set of "bagged" models. Once these predictive models have been validated, they can be used as an automated prediction tool for virtual high-throughput screening (VHTS).
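The two-step strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes scikit-learn and NumPy, uses synthetic data in place of the protein descriptors, and substitutes Lasso (L1-penalized least squares) for the sparse linear SVM in the feature selection step. All hyperparameter values (`alpha`, `C`, `epsilon`, `n_estimators`) are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the paper): 40 samples x 30 descriptors.
X = rng.normal(size=(40, 30))
# The response depends only on descriptors 0 and 1; the other 28 are irrelevant.
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.1 * rng.normal(size=40)

# Put all descriptors on a common scale before fitting.
X_std = StandardScaler().fit_transform(X)

# Step 1: a sparse linear model acts as a feature filter.
# Assumption: Lasso is a stand-in here for the sparse linear SVM.
selector = Lasso(alpha=0.1).fit(X_std, y)
selected = np.flatnonzero(selector.coef_)  # indices with nonzero weight

# Step 2: bagged ensemble of nonlinear (RBF-kernel) SVR models,
# each trained on a bootstrap resample of the selected descriptors.
ensemble = BaggingRegressor(
    SVR(kernel="rbf", C=10.0, epsilon=0.1),
    n_estimators=25,
    random_state=0,
)
ensemble.fit(X_std[:, selected], y)

# The bagged prediction is the average over the individual SVR models.
y_pred = ensemble.predict(X_std[:, selected])
print("selected descriptor indices:", selected)
```

In this sketch the descriptors with nonzero linear weights are the ones passed on to the nonlinear ensemble, which mirrors the role the abstract assigns to the sparse linear step: discarding irrelevant or redundant descriptors before the nonlinear models are trained.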

Similar resources

Evaluation of selectivity in multimodal anion exchange systems: a priori prediction of protein retention and examination of mobile phase modifier effects.

Although recent advances in multimodal chromatography have shown significant potential for selective protein purification, there is a need to establish a deeper understanding of the nature of selectivity in these systems. In this work, the adsorption behavior of a library of commercially available proteins with varying physicochemical properties was investigated. Linear gradient experiments wer...

Prediction of the effect of mobile-phase salt type on protein retention and selectivity in anion exchange systems.

This study examines the effect of different salt types on protein retention and selectivity in anion exchange systems. Particularly, linear retention data for various proteins were obtained on two structurally different anion exchange stationary-phase materials in the presence of three salts with different counterions. The data indicated that the effects are, for the most part, nonspecific, alt...

Prediction of soil cation exchange capacity using support vector regression optimized by genetic algorithm and adaptive network-based fuzzy inference system

Soil cation exchange capacity (CEC) is a parameter that represents soil fertility. Being difficult to measure, pedotransfer functions (PTFs) can be routinely applied for prediction of CEC by soil physicochemical properties that can be easily measured. This study developed the support vector regression (SVR) combined with genetic algorithm (GA) together with the adaptive network-based fuzzy infe...

A machine learning approach for the prediction of DNA and peptide HPLC retention times

Here we present a method for prediction of HPLC retention times based on support vector regression. In contrast to existing prediction methods for DNA, our method takes the secondary structure of DNA into account. The method is also well suited for retention time prediction of peptides.

Stock Price Prediction using Machine Learning and Swarm Intelligence

Background and Objectives: Stock price prediction has become one of the interesting and also challenging topics for researchers in the past few years. Due to the non-linear nature of the time-series data of the stock prices, mathematical modeling approaches usually fail to yield acceptable results. Therefore, machine learning methods can be a promising solution to this problem. Methods: In this...

Journal:
  • Journal of Chemical Information and Computer Sciences

Volume 42, Issue 6

Pages -

Publication date: 2002